Abstract: This paper investigates a lossy source coding problem
in which two decoders each have access to their own
side-information. The correlated sources are a product of two
component correlated sources, and we exclusively investigate the
case in which each component is degraded. We show the
rate-distortion function for that case, and give the following
observations. When the components are degraded in matched order,
the rate-distortion function of the product sources is equal to
the sum of the component-wise rate-distortion functions. On the
other hand, the former is strictly smaller than the latter when
the component sources are degraded in mismatched order.

I. Introduction 

The source coding problem for correlated sources has been
regarded as an important research area in information theory,
and various types of coding problems have been studied so far
(e.g. [1], [2], [3], [4], [5]). In particular, our focus in this
paper is the lossy coding problem posed by Heegard and Berger [6].

In the problem, there are one encoder and multiple decoders
(see Fig. 1). In this paper, we only treat the case with two
decoders. The encoder sends an encoded version of the principal
source $X$, and the decoders reproduce the principal source within
prescribed distortion levels with the help of side-information $Y$
and $Z$ respectively.

In this setting, Heegard and Berger showed an upper bound
on the rate-distortion function. They also showed that the upper
bound is tight if the sources are degraded, i.e., $X$, $Y$, and $Z$
form a Markov chain in this order. Although some researchers
investigated variants of Heegard and Berger's problem [7], [8],
there is no conclusive result, i.e., matching upper and lower
bounds, without the degraded assumption, and whether
Heegard and Berger's upper bound is tight for non-degraded
cases has been a long-standing open problem.

In order to provide some insight into this problem, we
investigate a special case of Heegard and Berger's problem
in this paper. Specifically, we consider the case in which
the correlated sources $(X, Y, Z)$ are a direct product of two
component correlated sources $(X_1, Y_1, Z_1)$ and $(X_2, Y_2, Z_2)$,
and the components are independent of each other (see Fig. 1).
Furthermore, we exclusively consider the case in which each
component is degraded, i.e., either Eq. (1) or Eq. (3) is
satisfied.

Although the problem setting treated in this paper seems too
restrictive at first glance, this is not the case. The
rate-distortion function of product sources can be strictly
smaller than the sum of the component-wise rate-distortion
functions even though the components are independent of each
other, an intuitive example of which is illustrated in Fig. 2.
Therefore, the rate-distortion function of product sources is not
trivial, and it is of interest to characterize it.

When Eq. (1) is satisfied, the joint sources $(X, Y, Z)$ are
degraded. Thus, Heegard and Berger's result implies that
their upper bound is tight. On the other hand, when Eq. (3)
is satisfied, the joint sources are not degraded. Thus, whether
Heegard and Berger's upper bound is tight has been unclear
so far. In this paper, we show that the upper bound is tight,
i.e., we characterize the rate-distortion function. To the best
of the author's knowledge, this is the first example in which the
rate-distortion function is characterized without the degraded
assumption.

The rest of the paper is organized as follows. In Section II,
we explain the problem setting treated in this paper, and also
explain known results. In Section III, we show our main result
and its proof. In Section IV, we show the Gaussian example.

Fig. 1. The coding system investigated in this paper.

II. Preliminaries 

In this section, we explain the problem formulation treated
in this paper, which is a special case of the problem
formulation treated by Heegard and Berger in [6]. Then, we explain
some known results.

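Let $(X, Y, Z) = ((X_1, X_2), (Y_1, Y_2), (Z_1, Z_2))$ be a product
of correlated sources, i.e., the components $(X_1, Y_1, Z_1)$ and
$(X_2, Y_2, Z_2)$ are independent of each other. The alphabets of
the sources are denoted by $\mathcal{X} = \mathcal{X}_1 \times \mathcal{X}_2$,
$\mathcal{Y} = \mathcal{Y}_1 \times \mathcal{Y}_2$, and
$\mathcal{Z} = \mathcal{Z}_1 \times \mathcal{Z}_2$ respectively.
Let $(X^n, Y^n, Z^n)$ be independent and identically distributed
copies of $(X, Y, Z)$.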

Fig. 2. An intuitive example in which the rate-distortion function
(with distortion 0) for the product source is strictly smaller than
the sum of the component-wise rate-distortion functions. $X_1$ and
$X_2$ are independent uniform binary random variables, and $\perp$
represents a constant random variable. When the two components are
encoded and decoded separately, 1 bit must be sent for each
component, which means that 2 bits must be sent to reproduce
$(X_1, X_2)$ at both decoders. On the other hand, if the encoder
sends $X_1 \oplus X_2$, then both decoders can reproduce
$(X_1, X_2)$ as in network coding [9]. Thus, when the components
are encoded and decoded jointly, 1 bit suffices for the decoders
to reproduce $(X_1, X_2)$. This example can also be regarded as a
special case of the complementary delivery network [10], [11], and
the relation between the complementary delivery network and
network coding was pointed out in [12].
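
The XOR scheme of Fig. 2 can be checked exhaustively with a few lines of Python. This is only a sketch: it assumes that one decoder observes $X_1$ as side-information and the other observes $X_2$ (the caption does not spell out which decoder holds which component), and all function names are hypothetical.

```python
from itertools import product
from typing import Tuple


def encode(x1: int, x2: int) -> int:
    """Single transmitted bit: the XOR of the two source bits."""
    return x1 ^ x2


def decode_with_x1(s: int, x1: int) -> Tuple[int, int]:
    """Decoder assumed to observe X1: recover X2 as S xor X1."""
    return x1, s ^ x1


def decode_with_x2(s: int, x2: int) -> Tuple[int, int]:
    """Decoder assumed to observe X2: recover X1 as S xor X2."""
    return s ^ x2, x2


def xor_scheme_is_lossless() -> bool:
    """Check all four source pairs: both decoders must recover (X1, X2)."""
    return all(
        decode_with_x1(encode(x1, x2), x1) == (x1, x2)
        and decode_with_x2(encode(x1, x2), x2) == (x1, x2)
        for x1, x2 in product((0, 1), repeat=2)
    )
```

Calling `xor_scheme_is_lossless()` confirms that one transmitted bit per source pair is enough for both decoders, whereas separate coding of the two components would spend one bit on each.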



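Let $\hat{\mathcal{X}}_1$, $\hat{\mathcal{X}}_2$, $\tilde{\mathcal{X}}_1$,
and $\tilde{\mathcal{X}}_2$ be reproduction alphabets, and for
$i = 1, 2$ let

\[ \hat{d}_i : \mathcal{X}_i \times \hat{\mathcal{X}}_i \to [0, \infty), \]

\[ \tilde{d}_i : \mathcal{X}_i \times \tilde{\mathcal{X}}_i \to [0, \infty) \]

be distortion measures. The coding system treated in this paper
consists of one encoder

\[ \varphi_n : \mathcal{X}^n \to \{1, \ldots, M_n\} \]

and two decoders

\[ \phi_n : \{1, \ldots, M_n\} \times \mathcal{Y}^n \to \hat{\mathcal{X}}_1^n \times \hat{\mathcal{X}}_2^n \]

and

\[ \psi_n : \{1, \ldots, M_n\} \times \mathcal{Z}^n \to \tilde{\mathcal{X}}_1^n \times \tilde{\mathcal{X}}_2^n. \]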

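For a quadruplet $D = (\hat{D}_1, \hat{D}_2, \tilde{D}_1, \tilde{D}_2)$,
a rate $R$ is said to be $D$-achievable if, for any $\gamma > 0$,
there exists a sequence of codes
$\{(\varphi_n, \phi_n, \psi_n)\}_{n=1}^{\infty}$ such that

\[ \frac{1}{n} \log M_n \le R + \gamma \]

and

\[ \frac{1}{n} \sum_{t=1}^{n} E[\hat{d}_i(X_{it}, \hat{X}_{it})] \le \hat{D}_i + \gamma, \]

\[ \frac{1}{n} \sum_{t=1}^{n} E[\tilde{d}_i(X_{it}, \tilde{X}_{it})] \le \tilde{D}_i + \gamma \]

for $i = 1, 2$ are satisfied for sufficiently large $n$,
where $(\hat{X}_1^n, \hat{X}_2^n) = \phi_n(\varphi_n(X^n), Y^n)$ and
$(\tilde{X}_1^n, \tilde{X}_2^n) = \psi_n(\varphi_n(X^n), Z^n)$.
Then, the rate-distortion function is defined as

\[ R(D) := \inf\{R : R \text{ is } D\text{-achievable}\}. \]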



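In [6], Heegard and Berger showed an upper bound on the
rate-distortion function.

Proposition 1 ([6, Theorem 2]) Let $(W, \hat{U}, \tilde{U})$ be
auxiliary random variables satisfying

1) $(W, \hat{U}, \tilde{U}) \leftrightarrow X \leftrightarrow (Y, Z)$.

2) There exist functions $\hat{X}_i(W, \hat{U}, Y)$ and
$\tilde{X}_i(W, \tilde{U}, Z)$ such that
$E[\hat{d}_i(X_i, \hat{X}_i)] \le \hat{D}_i$ and
$E[\tilde{d}_i(X_i, \tilde{X}_i)] \le \tilde{D}_i$ for $i = 1, 2$.

Then, we have

\[ R(D) \le \max\{I(W; X | Y), I(W; X | Z)\} + I(\hat{U}; X | Y, W) + I(\tilde{U}; X | Z, W). \]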

Remark 2 In [6], Heegard and Berger also showed an upper
bound on the rate-distortion function for three or more
decoders. However, Timo et al. pointed out that the statement
of [6, Theorem 2] for three or more decoders is invalid, and
only the statement for two decoders is valid [12]. In [12], they
also showed a corrected upper bound on the rate-distortion
function for three or more decoders.

When the component sources are degraded in matched
order, i.e.,

\[ X_1 \leftrightarrow Y_1 \leftrightarrow Z_1, \qquad X_2 \leftrightarrow Y_2 \leftrightarrow Z_2 \tag{1} \]

are satisfied, then the joint sources $(X, Y, Z)$ are degraded,
i.e.,

\[ X \leftrightarrow Y \leftrightarrow Z. \]

For the degraded sources, Heegard and Berger [6] showed
that the upper bound in Proposition 1 is tight. In particular, for
a product of two sources, we have the following statement.

Proposition 3 ([6]) If the component sources are degraded
in matched order, i.e., Eq. (1) is satisfied, then we have

\[ R(D) = R^*(D) := \min[ I(W_1; X_1 | Z_1) + I(U_1; X_1 | Y_1, W_1) + I(W_2; X_2 | Z_2) + I(U_2; X_2 | Y_2, W_2) ], \]

where the minimization is taken over all auxiliary random
variables $W_1, W_2, U_1, U_2$ satisfying the following:

1) $(W_i, U_i) \leftrightarrow X_i \leftrightarrow (Y_i, Z_i)$ for $i = 1, 2$.

2) $(W_1, U_1)$ and $(W_2, U_2)$ are independent of each other.

3) There exist functions $\hat{X}_i(W_i, U_i, Y_i)$ and
$\tilde{X}_i(W_i, Z_i)$ such that
$E[\hat{d}_i(X_i, \hat{X}_i)] \le \hat{D}_i$ and
$E[\tilde{d}_i(X_i, \tilde{X}_i)] \le \tilde{D}_i$ for $i = 1, 2$.

4) $|\mathcal{W}_i| \le |\mathcal{X}_i| + 2$ and
$|\mathcal{U}_i| \le (|\mathcal{X}_i| + 1)^2$ for $i = 1, 2$,
where $\mathcal{W}_i$ and $\mathcal{U}_i$ are the ranges of $W_i$
and $U_i$ respectively.

Remark 4 Technically, the result in [6] does not directly
imply Proposition 3 because Proposition 3 states a stronger
condition on the auxiliary random variables, i.e., $(U_1, W_1)$
and $(U_2, W_2)$ are independent. We give a proof of Proposition 3
in Appendix A for the reader's convenience.

Note that $R^*(D)$ is nothing but the sum of the
component-wise rate-distortion functions, i.e.,

\[ R^*(D) = R_1^*(\hat{D}_1, \tilde{D}_1) + R_2^*(\hat{D}_2, \tilde{D}_2), \]

where

\[ R_i^*(\hat{D}_i, \tilde{D}_i) = \min[ I(W_i; X_i | Z_i) + I(U_i; X_i | Y_i, W_i) ] \tag{2} \]

and the minimization in Eq. (2) is taken over all $(U_i, W_i)$
satisfying the conditions 1, 3, and 4 in Proposition 3. This
fact implies that the optimal scheme for the degraded product
sources is to combine the component-wise optimal schemes.

When the sources $(X, Y, Z)$ are not necessarily degraded,
whether the upper bound in Proposition 1 is tight has
been an open problem for a long time. In the next section,
we will show that the upper bound is tight if the component
sources satisfy Eq. (3).

III. Main Results 

A. Statements of results 

In this section, we consider the case in which the component
sources are degraded in mismatched order, i.e.,

\[ X_1 \leftrightarrow Y_1 \leftrightarrow Z_1, \qquad X_2 \leftrightarrow Z_2 \leftrightarrow Y_2 \tag{3} \]

are satisfied. In this case, the joint sources $(X, Y, Z)$ are not
degraded, and the rate-distortion function $R(D)$ has not been
clarified. The following is our main result.

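Theorem 5 Suppose that Eq. (3) is satisfied. Then, we have

\[
\begin{aligned}
R(D) = R^*(D) := \min\big[ & \max\{ I(W_1; X_1 | Y_1) + I(W_2; X_2 | Y_2), \\
& \qquad\; I(W_1; X_1 | Z_1) + I(W_2; X_2 | Z_2) \} \\
& + I(U_1; X_1 | Y_1, W_1) + I(U_2; X_2 | Z_2, W_2) \big],
\end{aligned}
\]

where the minimization is taken over all auxiliary random
variables $W_1, W_2, U_1, U_2$ satisfying the following:

1) $(W_i, U_i) \leftrightarrow X_i \leftrightarrow (Y_i, Z_i)$ for $i = 1, 2$.

2) $(W_1, U_1)$ and $(W_2, U_2)$ are independent of each other.

3) There exist functions $\hat{X}_1(W_1, U_1, Y_1)$, $\hat{X}_2(W_2, Y_2)$,
$\tilde{X}_1(W_1, Z_1)$, and $\tilde{X}_2(W_2, U_2, Z_2)$ such that

\[ E[\hat{d}_i(X_i, \hat{X}_i)] \le \hat{D}_i \]

and

\[ E[\tilde{d}_i(X_i, \tilde{X}_i)] \le \tilde{D}_i \]

for $i = 1, 2$.

4) $|\mathcal{W}_i| \le |\mathcal{X}_i| + 2$ and
$|\mathcal{U}_i| \le (|\mathcal{X}_i| + 1)^2$ for $i = 1, 2$,
where $\mathcal{W}_i$ and $\mathcal{U}_i$ are the ranges of $W_i$
and $U_i$ respectively.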

When the distortion levels are all 0, we have the following
corollary, which can also be derived as a straightforward
consequence of Sgarro's result [13].

Corollary 6 When $(\hat{D}_1, \hat{D}_2, \tilde{D}_1, \tilde{D}_2) = (0, 0, 0, 0)$, we
have

\[ R(0) = \max\{ H(X_1 | Y_1) + H(X_2 | Y_2), \; H(X_1 | Z_1) + H(X_2 | Z_2) \}. \]

Remark 7 It should be noted that

\[
\begin{aligned}
& \max\{ H(X_1 | Y_1) + H(X_2 | Y_2), \; H(X_1 | Z_1) + H(X_2 | Z_2) \} \\
& \quad < \max\{ H(X_1 | Y_1), H(X_1 | Z_1) \} + \max\{ H(X_2 | Y_2), H(X_2 | Z_2) \}
\end{aligned}
\]

in general. This fact implies that the combination of the
component-wise optimal schemes is not necessarily optimal
even though the components are independent of each other. This
phenomenon also appears in the lossy case, which will be
illustrated in Section IV by a Gaussian example.
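
The gap in Remark 7 is easy to evaluate numerically. The sketch below compares the joint rate of Corollary 6 with the rate of separate component-wise coding, taking the four conditional entropies as inputs. The function names are hypothetical, and the binary-entropy helper is only meaningful under the added assumption that each $X_i$ is a uniform bit observed through a binary symmetric channel, which is not part of the paper's setting.

```python
from math import log2


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits; h(0) = h(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


def joint_rate(h1y: float, h2y: float, h1z: float, h2z: float) -> float:
    """Corollary 6: max of the two per-decoder sums of conditional entropies."""
    return max(h1y + h2y, h1z + h2z)


def separate_rate(h1y: float, h2y: float, h1z: float, h2z: float) -> float:
    """Component-wise coding: each component pays for its worse decoder."""
    return max(h1y, h1z) + max(h2y, h2z)


# Fig. 2 setting: H(X1|Y1) = 0, H(X2|Y2) = 1, H(X1|Z1) = 1, H(X2|Z2) = 0.
fig2_joint = joint_rate(0.0, 1.0, 1.0, 0.0)
fig2_separate = separate_rate(0.0, 1.0, 1.0, 0.0)
```

For the Fig. 2 setting the joint rate is 1 bit and the separate rate is 2 bits. For noisy side-information one could pass values such as `binary_entropy(0.1)` and `binary_entropy(0.2)` under the binary symmetric channel assumption stated above.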

Remark 8 In [14], Tian and Diggavi proposed a coding
scheme that is different from that of [6]. Although joint encoding
and decoding is required to achieve the rate-distortion function
given in Theorem 5, we can construct a code that achieves the
rate-distortion function from the component-wise coding scheme
of [14] in a similar manner as in the example of Fig. 2.

When we apply the coding scheme of [14] to the first
component source $(X_1, Y_1, Z_1)$, the source $X_1^n$ is quantized
into the common description $W_1^n$ and the private description
$U_1^n$. Then, the common description $W_1^n$ is first bin coded at
rate

\[ I(W_1; X_1 | Y_1) = I(W_1; X_1) - I(W_1; Y_1), \]

and then $W_1^n$ is bin coded again at extra rate

\[ I(W_1; Y_1 | Z_1) = I(W_1; Y_1) - I(W_1; Z_1). \]

By using the first bin index $I_1$, the first decoder (with $Y_1^n$)
can reconstruct the common description $W_1^n$. By using both
the first bin index $I_1$ and the extra bin index $I_2$, the second
decoder (with $Z_1^n$) can reconstruct $W_1^n$. After that, the private
description $U_1^n$ is transmitted to the first decoder at rate

\[ I(U_1; X_1 | Y_1, W_1). \]

Similarly, when we apply the coding scheme of [14] to
the second component source $(X_2, Y_2, Z_2)$, the source $X_2^n$ is
quantized into the common description $W_2^n$ and the private
description $U_2^n$. Then, the common description $W_2^n$ is bin
coded into $J_1$ and $J_2$ at rates

\[ I(W_2; X_2 | Z_2) = I(W_2; X_2) - I(W_2; Z_2) \]

and

\[ I(W_2; Z_2 | Y_2) = I(W_2; Z_2) - I(W_2; Y_2) \]

respectively, so that the first decoder (with $Y_2^n$) can reconstruct
$W_2^n$ from both $J_1$ and $J_2$, and the second decoder (with $Z_2^n$)
can reconstruct $W_2^n$ from $J_1$. The private description $U_2^n$ is
also transmitted to the second decoder at rate

\[ I(U_2; X_2 | Z_2, W_2). \]



By using the above two component-wise coding schemes,
we can construct a joint encoding and decoding scheme as
follows. First, the encoder sends $(I_1, J_1, I_2 \oplus J_2)$. This
requires the rate

\[ I(W_1; X_1 | Y_1) + I(W_2; X_2 | Z_2) + \max[ I(W_1; Y_1 | Z_1), I(W_2; Z_2 | Y_2) ]. \]

Note that the first (second) decoder can obtain $J_2$ ($I_2$) by
first reconstructing $W_1^n$ ($W_2^n$) and then subtracting $I_2$ ($J_2$)
from $I_2 \oplus J_2$. The encoder also sends the private descriptions
$U_1^n$ and $U_2^n$ at rates $I(U_1; X_1 | Y_1, W_1)$ and
$I(U_2; X_2 | Z_2, W_2)$ respectively. Consequently, the total rate
coincides with the rate-distortion function given in Theorem 5.
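
The rate accounting of the joint scheme can be verified in two lines. The following worked equation is a sketch that uses only the two identities for the extra bin rates stated in this remark, after moving the common terms inside the maximum.

```latex
\begin{align*}
& I(W_1;X_1|Y_1) + I(W_2;X_2|Z_2)
  + \max[\, I(W_1;Y_1|Z_1),\; I(W_2;Z_2|Y_2) \,] \\
&\quad = \max\{\, I(W_1;X_1|Y_1) + I(W_1;Y_1|Z_1) + I(W_2;X_2|Z_2), \\
&\qquad\qquad\; I(W_1;X_1|Y_1) + I(W_2;X_2|Z_2) + I(W_2;Z_2|Y_2) \,\} \\
&\quad = \max\{\, I(W_1;X_1|Z_1) + I(W_2;X_2|Z_2),\;
                 I(W_1;X_1|Y_1) + I(W_2;X_2|Y_2) \,\},
\end{align*}
% since I(W_1;X_1|Y_1) + I(W_1;Y_1|Z_1) = I(W_1;X_1) - I(W_1;Z_1) = I(W_1;X_1|Z_1)
% and   I(W_2;X_2|Z_2) + I(W_2;Z_2|Y_2) = I(W_2;X_2) - I(W_2;Y_2) = I(W_2;X_2|Y_2).
```

Adding the private-description rates $I(U_1; X_1 | Y_1, W_1)$ and $I(U_2; X_2 | Z_2, W_2)$ then gives the objective function in Theorem 5.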

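B. Proof of Theorem 5

1) Direct Part: The direct part is a straightforward
consequence of Proposition 1. For any auxiliary random variables
$(W_1, W_2, U_1, U_2)$ satisfying the conditions in Theorem 5, let

\begin{align*}
W &= (W_1, W_2), \\
\hat{U} &= U_1, \\
\tilde{U} &= U_2, \\
\hat{X}_1(W, \hat{U}, Y) &= \hat{X}_1(W_1, U_1, Y_1), \\
\hat{X}_2(W, \hat{U}, Y) &= \hat{X}_2(W_2, Y_2), \\
\tilde{X}_1(W, \tilde{U}, Z) &= \tilde{X}_1(W_1, Z_1), \\
\tilde{X}_2(W, \tilde{U}, Z) &= \tilde{X}_2(W_2, U_2, Z_2).
\end{align*}

Then, Proposition 1 implies Theorem 5. □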
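
To see why this choice of auxiliary random variables turns the bound of Proposition 1 into the expression of Theorem 5, the following worked step may help. It is a sketch under the assumption that the two tuples $(W_1, U_1, X_1, Y_1, Z_1)$ and $(W_2, U_2, X_2, Y_2, Z_2)$ are mutually independent, which is how conditions 1) and 2) of Theorem 5 are read here together with the independence of the component sources.

```latex
\begin{align*}
I(W;X|Y) &= I(W_1,W_2;X_1,X_2|Y_1,Y_2)
          = I(W_1;X_1|Y_1) + I(W_2;X_2|Y_2), \\
I(W;X|Z) &= I(W_1,W_2;X_1,X_2|Z_1,Z_2)
          = I(W_1;X_1|Z_1) + I(W_2;X_2|Z_2), \\
I(\hat{U};X|Y,W) &= I(U_1;X_1,X_2|Y_1,Y_2,W_1,W_2)
                  = I(U_1;X_1|Y_1,W_1), \\
I(\tilde{U};X|Z,W) &= I(U_2;X_1,X_2|Z_1,Z_2,W_1,W_2)
                    = I(U_2;X_2|Z_2,W_2).
\end{align*}
```

Substituting these four identities into the upper bound of Proposition 1 gives the objective function minimized in Theorem 5.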
2) Converse Part: As we have mentioned in Section II,
Heegard and Berger showed the converse coding theorem for
the degraded case. In the course of the proof, they essentially
showed the following lemma, which can be shown only for
the degraded case. Although our purpose is to show the
converse coding theorem for the non-degraded case, we need
the following lemma in our converse proof of Theorem 5. A
proof of Lemma 9 is given in Appendix B.

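Lemma 9 Let $(A, B, C) = ((A_1, A_2), (B_1, B_2), (C_1, C_2))$
be a product of correlated sources such that

\[ A_i \leftrightarrow B_i \leftrightarrow C_i \tag{4} \]

for both $i = 1$ and $i = 2$. Let $(A^n, B^n, C^n)$ be independent and
identically distributed copies of $(A, B, C)$. Then, for any
function $T_n = f_n(A^n)$, we have

\begin{align*}
H(T_n) \ge \sum_{t=1}^{n} \big[
 & I(T_n, B_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n; A_{1t} | C_{1t}) \\
 & + I(B_{1t}^+, B_2^n; A_{1t} | B_{1t}, T_n, B_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n) \\
 & + I(T_n, B_1^n, B_{2t}^-, C_1^n, C_{2t}^-, C_{2t}^+; A_{2t} | C_{2t}) \\
 & + I(B_{2t}^+; A_{2t} | B_{2t}, T_n, B_1^n, B_{2t}^-, C_1^n, C_{2t}^-, C_{2t}^+) \big],
\end{align*}

where we use the notations $B_{1t}^- = (B_{11}, \ldots, B_{1(t-1)})$,
$B_{1t}^+ = (B_{1(t+1)}, \ldots, B_{1n})$, and so on.

It should be noted that the bounds in Lemma 9 do not involve
the decoders, which is important when we use Lemma 9 in
the converse proof below.

Proof of Converse Part)

Suppose that the rate $R$ is $D$-achievable. Then, for any
$\gamma > 0$ there exists a code $(\varphi_n, \phi_n, \psi_n)$ such that

\[ \frac{1}{n} H(S_n) \le \frac{1}{n} \log M_n \le R + \gamma \tag{5} \]

and

\[ \frac{1}{n} \sum_{t=1}^{n} \hat{D}_{it} \le \hat{D}_i + \gamma, \tag{6} \]

\[ \frac{1}{n} \sum_{t=1}^{n} \tilde{D}_{it} \le \tilde{D}_i + \gamma \tag{7} \]

for $i = 1, 2$ are satisfied, where we set $S_n = \varphi_n(X^n)$,
$\hat{D}_{it} = E[\hat{d}_i(X_{it}, \hat{X}_{it})]$, and
$\tilde{D}_{it} = E[\tilde{d}_i(X_{it}, \tilde{X}_{it})]$ for
$(\hat{X}_1^n, \hat{X}_2^n) = \phi_n(\varphi_n(X^n), Y^n)$ and
$(\tilde{X}_1^n, \tilde{X}_2^n) = \psi_n(\varphi_n(X^n), Z^n)$.

The key idea of the proof is to derive two lower bounds on
$H(S_n)$ by using Lemma 9 as follows. First, let
$(A_1, B_1, C_1) = (X_1, Y_1, Z_1)$ and $(A_2, B_2, C_2) = (X_2, Z_2, Z_2)$.
Then, since $(A, B, C)$ satisfies Eq. (4), we can use Lemma 9 and we
have

\begin{align*}
\frac{1}{n} H(S_n)
 &\ge \frac{1}{n} \sum_{t=1}^{n} \big[ I(S_n, Y_{1t}^-, Z_{1t}^-, Z_{1t}^+, Z_2^n; X_{1t} | Z_{1t}) \\
 &\qquad + I(Y_{1t}^+; X_{1t} | Y_{1t}, S_n, Y_{1t}^-, Z_{1t}^-, Z_{1t}^+, Z_2^n) \\
 &\qquad + I(S_n, Y_1^n, Z_1^n, Z_{2t}^-, Z_{2t}^+; X_{2t} | Z_{2t}) \big] \\
 &= \frac{1}{n} \sum_{t=1}^{n} \big[ I(S_n, Y_{1t}^-, Y_2^n, Z_{1t}^-, Z_{1t}^+, Z_2^n; X_{1t} | Z_{1t}) \\
 &\qquad + I(Y_{1t}^+; X_{1t} | Y_{1t}, S_n, Y_{1t}^-, Y_2^n, Z_{1t}^-, Z_{1t}^+, Z_2^n) \\
 &\qquad + I(S_n, Y_1^n, Y_{2t}^-, Y_{2t}^+, Z_1^n, Z_{2t}^-, Z_{2t}^+; X_{2t} | Z_{2t}) \big] \tag{8} \\
 &= \frac{1}{n} \sum_{t=1}^{n} \big[ I(W_{1t}; X_{1t} | Z_{1t}) + I(U_{1t}; X_{1t} | Y_{1t}, W_{1t}) \\
 &\qquad + I(W_{2t}, U_{2t}; X_{2t} | Z_{2t}) \big] \\
 &= I(W_{1T}; X_{1T} | Z_{1T}, T) + I(U_{1T}; X_{1T} | Y_{1T}, W_{1T}, T) \\
 &\qquad + I(W_{2T}, U_{2T}; X_{2T} | Z_{2T}, T) \\
 &= I(W_{1T}, T; X_{1T} | Z_{1T}) + I(U_{1T}; X_{1T} | Y_{1T}, W_{1T}, T) \\
 &\qquad + I(W_{2T}, T, U_{2T}; X_{2T} | Z_{2T}), \tag{9}
\end{align*}

where we used the fact that $Y_2$ is a degraded version of $Z_2$ in
Eq. (8), and we set

\begin{align*}
W_{1t} &= (S_n, Y_{1t}^-, Y_2^n, Z_{1t}^-, Z_{1t}^+, Z_2^n), \\
U_{1t} &= Y_{1t}^+, \\
W_{2t} &= (S_n, Y_1^n, Y_{2t}^-, Y_{2t}^+, Z_1^n, Z_{2t}^-), \\
U_{2t} &= Z_{2t}^+,
\end{align*}

and $T$ is a uniform random number on $\{1, \ldots, n\}$ that
is independent of the other random variables. Note that
$W_{1t}, U_{1t}, W_{2t}, U_{2t}$ satisfy
$(W_{it}, U_{it}) \leftrightarrow X_{it} \leftrightarrow (Y_{it}, Z_{it})$ for
$i = 1, 2$.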

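Similarly, let $(A_1, B_1, C_1) = (X_2, Z_2, Y_2)$ and
$(A_2, B_2, C_2) = (X_1, Y_1, Y_1)$. Then, since $(A, B, C)$ satisfies
Eq. (4), we can use Lemma 9 and we have

\begin{align*}
\frac{1}{n} H(S_n)
 &\ge \frac{1}{n} \sum_{t=1}^{n} \big[ I(S_n, Y_1^n, Y_{2t}^-, Y_{2t}^+, Z_{2t}^-; X_{2t} | Y_{2t}) \\
 &\qquad + I(Z_{2t}^+; X_{2t} | Z_{2t}, S_n, Y_1^n, Y_{2t}^-, Y_{2t}^+, Z_{2t}^-) \\
 &\qquad + I(S_n, Y_{1t}^-, Y_{1t}^+, Y_2^n, Z_2^n; X_{1t} | Y_{1t}) \big] \\
 &= \frac{1}{n} \sum_{t=1}^{n} \big[ I(S_n, Y_1^n, Y_{2t}^-, Y_{2t}^+, Z_1^n, Z_{2t}^-; X_{2t} | Y_{2t}) \\
 &\qquad + I(Z_{2t}^+; X_{2t} | Z_{2t}, S_n, Y_1^n, Y_{2t}^-, Y_{2t}^+, Z_1^n, Z_{2t}^-) \\
 &\qquad + I(S_n, Y_{1t}^-, Y_{1t}^+, Y_2^n, Z_{1t}^-, Z_{1t}^+, Z_2^n; X_{1t} | Y_{1t}) \big] \tag{10} \\
 &= \frac{1}{n} \sum_{t=1}^{n} \big[ I(W_{2t}; X_{2t} | Y_{2t}) + I(U_{2t}; X_{2t} | Z_{2t}, W_{2t}) \\
 &\qquad + I(W_{1t}, U_{1t}; X_{1t} | Y_{1t}) \big] \\
 &= I(W_{2T}; X_{2T} | Y_{2T}, T) + I(U_{2T}; X_{2T} | Z_{2T}, W_{2T}, T) \\
 &\qquad + I(W_{1T}, U_{1T}; X_{1T} | Y_{1T}, T) \\
 &= I(W_{2T}, T; X_{2T} | Y_{2T}) + I(U_{2T}; X_{2T} | Z_{2T}, W_{2T}, T) \\
 &\qquad + I(W_{1T}, T, U_{1T}; X_{1T} | Y_{1T}), \tag{11}
\end{align*}

where we used the fact that $Z_1$ is a degraded version of $Y_1$
in Eq. (10). Since $(W_{1t}, U_{1t}, Y_{1t})$ includes
$(S_n, Y_1^n, Y_2^n)$, there exists a function
$\hat{X}_{1t}(W_{1t}, U_{1t}, Y_{1t})$ satisfying
$E[\hat{d}_1(X_{1t}, \hat{X}_{1t})] = \hat{D}_{1t}$. Since
$(W_{2t}, Y_{2t})$ includes $(S_n, Y_1^n, Y_2^n)$, there exists a
function $\hat{X}_{2t}(W_{2t}, Y_{2t})$ satisfying
$E[\hat{d}_2(X_{2t}, \hat{X}_{2t})] = \hat{D}_{2t}$. Since
$(W_{1t}, Z_{1t})$ includes $(S_n, Z_1^n, Z_2^n)$, there exists a
function $\tilde{X}_{1t}(W_{1t}, Z_{1t})$ satisfying
$E[\tilde{d}_1(X_{1t}, \tilde{X}_{1t})] = \tilde{D}_{1t}$. Since
$(W_{2t}, U_{2t}, Z_{2t})$ includes $(S_n, Z_1^n, Z_2^n)$, there exists
a function $\tilde{X}_{2t}(W_{2t}, U_{2t}, Z_{2t})$ satisfying
$E[\tilde{d}_2(X_{2t}, \tilde{X}_{2t})] = \tilde{D}_{2t}$. Thus, there
exist functions $\hat{X}_1(W_{1T}, T, U_{1T}, Y_{1T})$,
$\hat{X}_2(W_{2T}, T, Y_{2T})$, $\tilde{X}_1(W_{1T}, T, Z_{1T})$,
and $\tilde{X}_2(W_{2T}, T, U_{2T}, Z_{2T})$ satisfying

\[ E[\hat{d}_i(X_i, \hat{X}_i)] \le \hat{D}_i + \gamma, \tag{12} \]

\[ E[\tilde{d}_i(X_i, \tilde{X}_i)] \le \tilde{D}_i + \gamma \tag{13} \]

for $i = 1, 2$. Thus, by combining Eqs. (5), (9), and (11), and
by taking $W_1 = (W_{1T}, T)$, $U_1 = U_{1T}$, $W_2 = (W_{2T}, T)$, and
$U_2 = U_{2T}$, we have that there exist $W_1, W_2, U_1, U_2$ satisfying
Eqs. (12) and (13) and

\[ R(D) \ge I(W_1; X_1 | Z_1) + I(U_1; X_1 | Y_1, W_1) + I(W_2, U_2; X_2 | Z_2), \]

\[ R(D) \ge I(W_1, U_1; X_1 | Y_1) + I(W_2; X_2 | Y_2) + I(U_2; X_2 | Z_2, W_2). \]

Although the auxiliary random variables $(W_1, U_1)$ and
$(W_2, U_2)$ chosen above are not independent of each other, they
never appear in any term simultaneously. Thus we can take
$(W_1, U_1)$ and $(W_2, U_2)$ to be independent of each other. Since
$\gamma > 0$ is arbitrary, by the continuity of $R^*(D)$ with respect to
$D$, and by using the support lemma [2], we have the statement
of the theorem. □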
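
The passage from the two lower bounds on $R(D)$ to the expression in Theorem 5 uses only the chain rule for mutual information. As a sketch of that last step, with the same symbols as in the two bounds:

```latex
\begin{align*}
I(W_2,U_2;X_2|Z_2) &= I(W_2;X_2|Z_2) + I(U_2;X_2|Z_2,W_2), \\
I(W_1,U_1;X_1|Y_1) &= I(W_1;X_1|Y_1) + I(U_1;X_1|Y_1,W_1),
\end{align*}
% so taking the larger of the two lower bounds gives
\begin{align*}
R(D) \ge{} & \max\{\, I(W_1;X_1|Y_1) + I(W_2;X_2|Y_2),\;
                      I(W_1;X_1|Z_1) + I(W_2;X_2|Z_2) \,\} \\
           & + I(U_1;X_1|Y_1,W_1) + I(U_2;X_2|Z_2,W_2).
\end{align*}
```

The right-hand side is the objective function of Theorem 5, so minimizing over the admissible auxiliary random variables yields $R(D) \ge R^*(D)$.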

IV. Gaussian Case 

To illustrate our main result, we show an explicit form of the
rate-distortion function for the Gaussian example. We consider
jointly Gaussian sources $(X_i, Y_i, Z_i)$ given by
$Y_i = X_i + N_{i,y}$ and $Z_i = X_i + N_{i,z}$, where $N_{i,y}$ and
$N_{i,z}$ are Gaussian noises with variances $\Sigma_{i,N_y}$ and
$\Sigma_{i,N_z}$ such that $\Sigma_{1,N_y} < \Sigma_{1,N_z}$ and
$\Sigma_{2,N_z} < \Sigma_{2,N_y}$ respectively. The conditional
variance of $X_i$ given $Y_i$ is denoted by $\Sigma_{i,x|y}$, etc.
To avoid tedious exceptional cases, we assume that
$\hat{D}_i < \Sigma_{i,x|y}$ and $\tilde{D}_i < \Sigma_{i,x|z}$ for
$i = 1, 2$.
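
As a small numerical aid for this setting, the sketch below computes the conditional variances and checks the mismatched ordering of the noise variances. It assumes that each $X_i$ is a scalar Gaussian with a known variance `var_x` and that the noises are independent of $X_i$; both the symbol for the source variance and the function names are hypothetical and do not appear in the paper.

```python
def conditional_variance(var_x: float, var_noise: float) -> float:
    """Variance of X given Y = X + N for independent scalar Gaussians X and N.

    Equivalent to 1 / (1 / var_x + 1 / var_noise).
    """
    return var_x * var_noise / (var_x + var_noise)


def is_mismatched_order(var_n1y: float, var_n1z: float,
                        var_n2y: float, var_n2z: float) -> bool:
    """True when component 1 favours decoder Y and component 2 favours decoder Z."""
    return var_n1y < var_n1z and var_n2z < var_n2y


def distortions_are_nontrivial(d_hat: float, d_tilde: float, var_x: float,
                               var_ny: float, var_nz: float) -> bool:
    """Check the assumption that each target distortion is below the
    corresponding conditional variance of the source."""
    return (d_hat < conditional_variance(var_x, var_ny)
            and d_tilde < conditional_variance(var_x, var_nz))
```

For instance, with unit source variance and unit noise variance, `conditional_variance(1.0, 1.0)` evaluates to 0.5, so any target distortion below 0.5 satisfies the non-triviality assumption for that decoder.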

In the above setting, the rate-distortion function is given by 
the following theorem. 

Theorem 10 We have 
R{D) = R*{D) 

+ ^^Og^2,x\yD2^, 
il0gEl^,|,A"l + i log £2^,1, (A"' - H2'Ny + ^2k) 



■log 



(A* 



) 



1 {b^ + j:2,n.] 

77 log ■ 



2 (A-1 - E-\ 



2,Af„ 



^2.N^ 



where 



A = max[A"i-E-]v^,A-i-S-jvJ, 



= max[A ^ - E2,7v„, A ^ 



^2,N^ 



Note that the component-wise rate-distortion functions are 
given by 



i?t(A,A) 



i?;(A,A) 



ilogSi,,,,^-! 

(A* 



4 10, 



2 '^(A-'-sr.k + s^k)' 



■l0gS2,:,|„A ^ 



1 



■log- 



(A + S2.L) 



2 (A ^ - S2,]v„ + ^2,^) 

By noting $\Sigma_{1,N_y} < \Sigma_{1,N_z}$ and
$\Sigma_{2,N_y} > \Sigma_{2,N_z}$, we can find

\[ R(D) < R_1^*(\hat{D}_1, \tilde{D}_1) + R_2^*(\hat{D}_2, \tilde{D}_2), \]

which implies that the combination of the component-wise
optimal schemes is suboptimal for Gaussian product sources.



V. Conclusion 

In this paper, we investigated the lossy coding problem for 
a product of two sources with two decoders, and characterized 
the rate-distortion function. 

It should be noted that the present work is motivated by
the results on the product of two broadcast channels [15], [16].
The technique used in our converse proof can be regarded as
the enhancement technique introduced by Weingarten et al. [17].
Our future research agenda is to extend our result to vector
Gaussian sources in which there exists correlation between
the component sources.
Appendix

A. Proof of Proposition 3

Since the direct part directly follows from Proposition 1,
we only prove the converse part. Suppose that $R$ is
$D$-achievable. Then, for any $\gamma > 0$, there exists a code
$(\varphi_n, \phi_n, \psi_n)$ satisfying Eqs. (5)-(7), where we use
the same notation as in Section III-B2. We will lower bound
$H(S_n)$ by using Lemma 9. Let $(A_1, B_1, C_1) = (X_1, Y_1, Z_1)$ and
$(A_2, B_2, C_2) = (X_2, Y_2, Z_2)$. Then, from Lemma 9, we have
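\begin{align*}
\frac{1}{n} H(S_n)
 &\ge \frac{1}{n} \sum_{t=1}^{n} \big[ I(S_n, Y_{1t}^-, Z_{1t}^-, Z_{1t}^+, Z_2^n; X_{1t} | Z_{1t}) \\
 &\qquad + I(Y_{1t}^+, Y_2^n; X_{1t} | Y_{1t}, S_n, Y_{1t}^-, Z_{1t}^-, Z_{1t}^+, Z_2^n) \\
 &\qquad + I(S_n, Y_1^n, Y_{2t}^-, Z_1^n, Z_{2t}^-, Z_{2t}^+; X_{2t} | Z_{2t}) \\
 &\qquad + I(Y_{2t}^+; X_{2t} | Y_{2t}, S_n, Y_1^n, Y_{2t}^-, Z_1^n, Z_{2t}^-, Z_{2t}^+) \big] \\
 &= \frac{1}{n} \sum_{t=1}^{n} \big[ I(W_{1t}; X_{1t} | Z_{1t}) + I(U_{1t}; X_{1t} | Y_{1t}, W_{1t}) \\
 &\qquad + I(W_{2t}; X_{2t} | Z_{2t}) + I(U_{2t}; X_{2t} | Y_{2t}, W_{2t}) \big] \\
 &= I(W_{1T}; X_{1T} | Z_{1T}, T) + I(U_{1T}; X_{1T} | Y_{1T}, W_{1T}, T) \\
 &\qquad + I(W_{2T}; X_{2T} | Z_{2T}, T) + I(U_{2T}; X_{2T} | Y_{2T}, W_{2T}, T) \\
 &= I(W_{1T}, T; X_{1T} | Z_{1T}) + I(U_{1T}; X_{1T} | Y_{1T}, W_{1T}, T) \\
 &\qquad + I(W_{2T}, T; X_{2T} | Z_{2T}) + I(U_{2T}; X_{2T} | Y_{2T}, W_{2T}, T),
\end{align*}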

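where we set

\begin{align*}
W_{1t} &= (S_n, Y_{1t}^-, Z_{1t}^-, Z_{1t}^+, Z_2^n), \\
U_{1t} &= (Y_{1t}^+, Y_2^n), \\
W_{2t} &= (S_n, Y_1^n, Y_{2t}^-, Z_1^n, Z_{2t}^-, Z_{2t}^+), \\
U_{2t} &= Y_{2t}^+,
\end{align*}

and $T$ is a uniform random number on $\{1, \ldots, n\}$ that
is independent of the other random variables. Note that
$W_{1t}, U_{1t}, W_{2t}, U_{2t}$ satisfy
$(W_{it}, U_{it}) \leftrightarrow X_{it} \leftrightarrow (Y_{it}, Z_{it})$ for
$i = 1, 2$.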

For a similar reason as in Section III-B2, there exist
functions $\hat{X}_i(W_{iT}, T, U_{iT}, Y_{iT})$ and
$\tilde{X}_i(W_{iT}, T, Z_{iT})$ satisfying
Eqs. (12) and (13) for $i = 1, 2$. Thus, by taking
$W_1 = (W_{1T}, T)$, $U_1 = U_{1T}$, $W_2 = (W_{2T}, T)$, and
$U_2 = U_{2T}$, we have that there exist $W_1, W_2, U_1, U_2$
satisfying Eqs. (12) and (13) and

\[ R(D) \ge I(W_1; X_1 | Z_1) + I(U_1; X_1 | Y_1, W_1) + I(W_2; X_2 | Z_2) + I(U_2; X_2 | Y_2, W_2). \]

Although the auxiliary random variables $(W_1, U_1)$ and
$(W_2, U_2)$ chosen above are not independent of each other, they
never appear in any term simultaneously. Thus we can take
$(W_1, U_1)$ and $(W_2, U_2)$ to be independent of each other. Since
$\gamma > 0$ is arbitrary, by the continuity of $R^*(D)$ with respect to
$D$, and by using the support lemma [2], we have the statement
of the proposition. □

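B. Proof of Lemma 9

First, by chain rules, we have

\begin{align*}
H(T_n) &\ge I(T_n; A^n | C^n) \\
 &= I(T_n, B^n; A^n | C^n) - I(B^n; A^n | T_n, C^n) \\
 &= \sum_{t=1}^{n} \big[ I(T_n, B_1^n, B_2^n; A_{1t} | A_{1t}^-, C_1^n, C_2^n) \\
 &\qquad - I(B_{1t}; A_1^n, A_2^n | T_n, B_{1t}^-, C_1^n, C_2^n) \\
 &\qquad + I(T_n, B_1^n, B_2^n; A_{2t} | A_1^n, A_{2t}^-, C_1^n, C_2^n) \\
 &\qquad - I(B_{2t}; A_1^n, A_2^n | T_n, B_1^n, B_{2t}^-, C_1^n, C_2^n) \big].
\end{align*}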

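Since $(A_{1t}, C_{1t})$ and $(A_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n)$ are
independent, we have

\begin{align*}
& I(T_n, B_1^n, B_2^n; A_{1t} | A_{1t}^-, C_1^n, C_2^n) \\
&\quad = I(T_n, A_{1t}^-, B_1^n, B_2^n, C_{1t}^-, C_{1t}^+, C_2^n; A_{1t} | C_{1t}) \\
&\quad \ge I(T_n, B_1^n, B_2^n, C_{1t}^-, C_{1t}^+, C_2^n; A_{1t} | C_{1t}).
\end{align*}

Similarly, since $(A_{2t}, C_{2t})$ and
$(A_1^n, A_{2t}^-, C_1^n, C_{2t}^-, C_{2t}^+)$ are independent, we have

\begin{align*}
& I(T_n, B_1^n, B_2^n; A_{2t} | A_1^n, A_{2t}^-, C_1^n, C_2^n) \\
&\quad \ge I(T_n, B_1^n, B_2^n, C_1^n, C_{2t}^-, C_{2t}^+; A_{2t} | C_{2t}).
\end{align*}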

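Furthermore, since the Markov chains

\[ B_{1t} \leftrightarrow (A_{1t}, C_{1t}) \leftrightarrow (T_n, A_{1t}^-, A_{1t}^+, A_2^n, B_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n) \]

and

\[ B_{2t} \leftrightarrow (A_{2t}, C_{2t}) \leftrightarrow (T_n, A_1^n, A_{2t}^-, A_{2t}^+, B_1^n, B_{2t}^-, C_1^n, C_{2t}^-, C_{2t}^+) \]

hold, we have

\[ I(B_{1t}; A_1^n, A_2^n | T_n, B_{1t}^-, C_1^n, C_2^n) = I(B_{1t}; A_{1t} | T_n, B_{1t}^-, C_1^n, C_2^n) \]

and

\[ I(B_{2t}; A_1^n, A_2^n | T_n, B_1^n, B_{2t}^-, C_1^n, C_2^n) = I(B_{2t}; A_{2t} | T_n, B_1^n, B_{2t}^-, C_1^n, C_2^n). \]

Thus, we have

\begin{align*}
H(T_n) \ge \sum_{t=1}^{n} \big[
 & I(T_n, B_1^n, B_2^n, C_{1t}^-, C_{1t}^+, C_2^n; A_{1t} | C_{1t}) \\
 & - I(B_{1t}; A_{1t} | T_n, B_{1t}^-, C_1^n, C_2^n) \\
 & + I(T_n, B_1^n, B_2^n, C_1^n, C_{2t}^-, C_{2t}^+; A_{2t} | C_{2t}) \\
 & - I(B_{2t}; A_{2t} | T_n, B_1^n, B_{2t}^-, C_1^n, C_2^n) \big]. \tag{14}
\end{align*}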
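By chain rules, we have

\begin{align*}
& I(T_n, B_1^n, B_2^n, C_{1t}^-, C_{1t}^+, C_2^n; A_{1t} | C_{1t})
  - I(B_{1t}; A_{1t} | T_n, B_{1t}^-, C_1^n, C_2^n) \\
&\quad = I(T_n, B_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n; A_{1t} | C_{1t})
  + I(B_{1t}; A_{1t} | T_n, B_{1t}^-, C_1^n, C_2^n) \\
&\qquad + I(B_{1t}^+, B_2^n; A_{1t} | B_{1t}, T_n, B_{1t}^-, C_1^n, C_2^n)
  - I(B_{1t}; A_{1t} | T_n, B_{1t}^-, C_1^n, C_2^n) \\
&\quad = I(T_n, B_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n; A_{1t} | C_{1t})
  + I(B_{1t}^+, B_2^n; A_{1t} | B_{1t}, T_n, B_{1t}^-, C_1^n, C_2^n). \tag{15}
\end{align*}

Similarly, we have

\begin{align*}
& I(T_n, B_1^n, B_2^n, C_1^n, C_{2t}^-, C_{2t}^+; A_{2t} | C_{2t})
  - I(B_{2t}; A_{2t} | T_n, B_1^n, B_{2t}^-, C_1^n, C_2^n) \\
&\quad = I(T_n, B_1^n, B_{2t}^-, C_1^n, C_{2t}^-, C_{2t}^+; A_{2t} | C_{2t})
  + I(B_{2t}^+; A_{2t} | B_{2t}, T_n, B_1^n, B_{2t}^-, C_1^n, C_2^n). \tag{16}
\end{align*}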

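From Eq. (4), we have

\[ C_{1t} \leftrightarrow B_{1t} \leftrightarrow (T_n, A_{1t}, B_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n). \]

Thus, we have

\begin{align*}
& I(B_{1t}^+, B_2^n; A_{1t} | B_{1t}, T_n, B_{1t}^-, C_1^n, C_2^n) \\
&\quad = I(B_{1t}^+, B_2^n, C_{1t}; A_{1t} | B_{1t}, T_n, B_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n) \\
&\quad \ge I(B_{1t}^+, B_2^n; A_{1t} | B_{1t}, T_n, B_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n). \tag{17}
\end{align*}

Similarly, from Eq. (4), we have

\begin{align*}
& I(B_{2t}^+; A_{2t} | B_{2t}, T_n, B_1^n, B_{2t}^-, C_1^n, C_2^n) \\
&\quad \ge I(B_{2t}^+; A_{2t} | B_{2t}, T_n, B_1^n, B_{2t}^-, C_1^n, C_{2t}^-, C_{2t}^+). \tag{18}
\end{align*}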

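Finally, by substituting Eqs. (15)-(18) into Eq. (14), we have

\begin{align*}
H(T_n) \ge \sum_{t=1}^{n} \big[
 & I(T_n, B_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n; A_{1t} | C_{1t}) \\
 & + I(B_{1t}^+, B_2^n; A_{1t} | B_{1t}, T_n, B_{1t}^-, C_{1t}^-, C_{1t}^+, C_2^n) \\
 & + I(T_n, B_1^n, B_{2t}^-, C_1^n, C_{2t}^-, C_{2t}^+; A_{2t} | C_{2t}) \\
 & + I(B_{2t}^+; A_{2t} | B_{2t}, T_n, B_1^n, B_{2t}^-, C_1^n, C_{2t}^-, C_{2t}^+) \big].
\end{align*}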